36 research outputs found

    Class-based probability estimation using a semantic hierarchy

    This article concerns the estimation of a particular kind of probability, namely, the probability of a noun sense appearing as a particular argument of a predicate. In order to overcome the accompanying sparse-data problem, the proposal here is to define the probabilities in terms of senses from a semantic hierarchy and exploit the fact that the senses can be grouped into classes consisting of semantically similar senses. There is a particular focus on the problem of how to determine a suitable class for a given sense, or, alternatively, how to determine a suitable level of generalization in the hierarchy. A procedure is developed that uses a chi-square test to determine a suitable level of generalization. In order to test the performance of the estimation method, a pseudo-disambiguation task is used, together with two alternative estimation methods. Each method uses a different generalization procedure; the first alternative uses the minimum description length principle, and the second uses Resnik's measure of selectional preference. In addition, the performance of our method is investigated using both the standard Pearson chi-square statistic and the log-likelihood chi-square statistic.
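The two statistics named in the abstract above can be compared on a small contingency table. The following is a minimal sketch only, not the article's procedure: the function names and the example counts are hypothetical, and the sketch covers only the computation of the two statistics, not the generalization procedure built on them.

```python
import math


def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def pearson_chi_square(table):
    """Pearson statistic: sum over cells of (O - E)^2 / E."""
    exp = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, exp)
        for o, e in zip(obs_row, exp_row)
    )


def log_likelihood_chi_square(table):
    """Log-likelihood statistic: 2 * sum over cells of O * ln(O / E).

    Cells with an observed count of zero contribute nothing to the sum.
    """
    exp = expected_counts(table)
    return 2.0 * sum(
        o * math.log(o / e)
        for obs_row, exp_row in zip(table, exp)
        for o, e in zip(obs_row, exp_row)
        if o > 0
    )


# Hypothetical counts, for illustration only (not data from the article).
counts = [[10, 20], [20, 10]]
print(pearson_chi_square(counts))
print(log_likelihood_chi_square(counts))
```

Both statistics are zero when the observed counts match the expected counts exactly, and both are conventionally referred to a chi-square distribution with (rows - 1) × (columns - 1) degrees of freedom.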

    Accounting for Uncertainty in Ecological Analysis: The Strengths and Limitations of Hierarchical Statistical Modeling

    Copyright by the Ecological Society of America. Analyses of ecological data should account for the uncertainty in the process(es) that generated the data. However, accounting for these uncertainties is a difficult task, since ecology is known for its complexity. Measurement and/or process errors are often the only sources of uncertainty modeled when addressing complex ecological problems, yet analyses should also account for uncertainty in sampling design, in model specification, in parameters governing the specified model, and in initial and boundary conditions. Only then can we be confident in the scientific inferences and forecasts made from an analysis. Probability and statistics provide a framework that accounts for multiple sources of uncertainty. Given the complexities of ecological studies, the hierarchical statistical model is an invaluable tool. This approach is not new in ecology, and there are many examples (both Bayesian and non-Bayesian) in the literature illustrating the benefits of this approach. In this article, we provide a baseline for concepts, notation, and methods, from which discussion on hierarchical statistical modeling in ecology can proceed. We have also planted some seeds for discussion and tried to show where the practical difficulties lie. Our thesis is that hierarchical statistical modeling is a powerful way of approaching ecological analysis in the presence of inevitable but quantifiable uncertainties, even if practical issues sometimes require pragmatic compromises.

    Improved multivariate prediction under a general linear model

    Assuming a general linear model with known covariance matrix, several linear and nonlinear predictors are presented and their properties are discussed. In the context of simultaneous multiple prediction, a total sum of squared errors is suggested as a loss function for comparing predictors. Based on a fundamental relationship between prediction and estimation, a very general class of predictors is developed from which predictors with uniformly smaller risk than that of the classical best linear unbiased (i.e., universal kriging) predictor can be constructed. © 1993 Academic Press. All rights reserved.

    A spatial analysis of variance applied to soil-water infiltration

    A spatial analysis of variance uses the spatial dependence among the observations to modify the usual inference procedures associated with a statistical linear model. When spatial correlation is present, the usual tests for presence of treatment effects may no longer be valid, and erroneous conclusions may result from assuming that the usual F ratios are F distributed. This is demonstrated using a spatial analysis of soil-water infiltration data. Emphasis is placed on modeling the spatial dependence structure with geostatistical techniques, and this spatial dependence structure is then used to test hypotheses about fixed effects using a nested linear model.

    Spatial prediction from networks

    This article defines a random-field model that can be used for the prediction of pollutants at locations where no data are available, based on data taken from a spatial network of monitoring sites. Acid deposition data collected from the UAPSP network in 1982 and 1983 are analyzed in two stages. Bias-resistant and outlier-resistant techniques are used to determine the spatial dependence; then a spatial model is built that is made up of a quadratic trend surface and the spatially correlated error. Spatial sampling plans and optimal designs for selecting monitoring sites are summarized and discussed. The question of the location of additional sites (and deletion of existing ones) is also addressed.